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ABSTRACT 

Though Fourier Transforms (FTs) are a common technique for finding correlation 
functions, they are not typically used in computations of the anisotropy of the two- 
point correlation function (2PCF) about the line of sight in wide-angle surveys because 
the line-of-sight direction is not constant on the Cartesian grid. Here we show how 
FTs can be used to compute the multipole moments of the anisotropic 2PCF. We also 
show how FTs can be used to accelerate the 3PCF algorithm of Slepian & Eisenstein 
(2015). In both cases, these FT methods allow one to avoid the computational cost 
of pair counting, which scales as the square of the number density of objects in the 
survey. With the upcoming large datasets of DESI, Euclid, and LSST, FT techniques 
will therefore offer an important complement to simple pair or triplet counts. 

Key words: cosmology: large-scale structnre of Universe, methods: data analysis, 
statistical 


1 INTRODUCTION 

In studying the large-scale clustering of galaxies, we com¬ 
monly use the two and three-point correlation functions ^ 
(2PCF) and C, (3PCF) (Peebles 1980; Bernardeau et al. 2002; 
Szapudi 2005 for reviews; recent observational work on the 
2PCF is Anderson et al. 2014; on the 3PCF, Kayo et al. 2004; 
McBride et al. 2011a, b; Guo et al. 2014). These are corre¬ 
lations of the fractional overdensity field (5(x) = p(x)/p— 1, 
where p(x) is the density field and p is the mean density. 
Also commonly used is the anisotropic 2PCF ^aniso, often 
written as multipole moments of the 2PCF with respect to 
the line of sight (Cabre & Gaztanaga 2009; Okumura & Jing 
2011; Reid et al. 2012; Samushia, Percival & Raccanelli 2012; 
Chuang & Wang 2012; Chuang & Wang 2013a, b; Ghuang 
et al. 2013; Sanchez et al. 2013; Xu et al. 2013; Ross, Per¬ 
cival & Manera 2015; White et al. 2015 for recent modeling 
and observational work). Typically, direct counting is used 
to compute ^aniso, and C,. The calculation of ^ and ^aniso 
scales as AnUmax, while that of scales as A(nUmax)^, where 
N is the number of galaxies in the survey, n is the number 
density, and Umax is the volume of a sphere of the maximum 
radius out to which the correlation is measured. For surveys 
such as the Sloan Digital Sky Survey (SDSS) (Alam et al. 
2015 for latest release), with of order one million galaxies, 
and upcoming efforts like Euclid (Laureijs et al. 2011), Large 
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Synoptic Survey Telescope (LSST) (LSST Dark Energy Sci¬ 
ence Collaboration 2012), Dark Energy Spectroscopic In¬ 
strument (DESI) (Levi et al. 2013), and WFIRST (Spergel 
et al. 2013) (see also Jain et al. 2015), with tens of mil¬ 
lions to billions of galaxies, these scalings render the 2PCF 
computationally expensive and the 3PCE computationally 
prohibitive for large Umax. Therefore, techniques that scale 
more favorably with number and number density of objects 
merit consideration. In this work, we argue for the utility 
of Fourier methods in measuring the two and three-point 
correlation functions, as their computational cost depends 
primarily on the number of grid cells Ng into which the 
survey volume is discretized (though gridding the density 
field depends linearly on the number of galaxies). While the 
methods will introduce artifacts due to the grid resolution, 
the balance of performance versus accuracy may be desirable 
for some applications, for example the analysis of large-scale 
correlations in large numbers of mock catalogs. 

It has long been known that the 2PCF can be computed 
using FTs, but, for reasons we detail further below (Section 
this has not been the favored approach. It has been less 
appreciated that the anisotropic 2PGF and 3PCF can also 
be computed using FTs, though they have been used for the 
projected 3PCF (Zheng 2004) and the 3PCF of the Cos¬ 
mic Microwave Background (Chen & Szapudi 2005), as well 
as multipole moments of the power spectrum (Bianchi et 
al. 2015; Scoccimarro 2015) and bispectrum (Scoccimarro 
2015). Though this will not be our focus here, FTs are stan¬ 
dardly used to compute the power spectrum, typically with 
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FKP weighting (Feldman, Kaiser & Peacock 1994). A recent 
implementation is discussed in Anderson et al. (2014) and 
Percival et al. (2014). 

In this paper, we begin by showing that several per¬ 
ceived disadvantages of using FTs for the standard 2PCF 
can be avoided with a particular way of setting up the 
Landy-Szalay estimator (Section [2]). We then show that 
the anisotropic 2PCF can also be straightforwardly handled 
with FTs (Section |3|). In Section lU we show that the 3PCF 
algorithm recently presented in Slepian & Eisenstein (2015; 
hereafter SE15), which already offers roughly a factor of 500 
speed improvement relative to naive triplet counting, can in 
many circumstances be further accelerated using FTs. Sec¬ 
tion [5] concludes. 

2 PAIR COUNTING WITH FOURIER 

TRANSFORMS 

2.1 Estimator and passage to the FT 

It is well-known that the 2PCF of a field can be computed 
quickly using the FT: it is simply the convolution of the den¬ 
sity held with itself, and hence by the Convolution Theorem 
reduces to multiplication in Fourier space: 

^(r) = J dx <5(x)<5(x + r) = y ^^^|5(k)|^e"”‘‘ '' (1) 

where here and throughout J(k) = f dx is the FT 

of the held. Numerically, the Integrals become sums over 
wavenumbers quantized to ht in the hnite box on which the 
integral is evaluated, and for periodic boundary conditions 
the largest wavelength is simply the periodicity. For a non¬ 
periodic survey, we can still use this method by expanding 
the grid to be well larger than the survey and setting <5 = 0 
outside the survey volume. 

However, this is not usually the method employed in 
cosmology, particularly in wide-held surveys. Instead, one 
typically uses an explicit counting of pairs, such as the 
Landy-Szalay (1993) estimator, 

USl = — = J dridrs 6l(ri - ra; S)A(ri)fV(r2) 

BB f dridr 2 d(ri — V 2 \ S)i?(ri)R(r 2 ) ’ 

with N = {D — B), D the data, B the random counts, and 9 
a binning function that is non-zero where its hrst argument 
is within a three-dimensional volume labeled by S. In this 
Section we focus on bins of hxed Cartesian separation. The 
random catalog is chosen to be a Monte Carlo realization 
of the mean density held, with weights such that the mean 
densities of the randoms and the data match. Changing vari¬ 
ables to the separation s = r 2 — ri, we have 

^ _ / ds 6)(s; S) / driA(ri)A(ri + s) 

f ds 9{s-, S) f driB(ri)B(ri + s) 
/dse(s;S)[jV*jV](s) 
fds9(s;S)[B*B](s)' 

Thus the binned estimator is simply a binned convolu¬ 
tion, and the convolution can be evaluated with FTs in 
0{Ng log Ng) time (Cooley & Tukey 1965; Press et al. 2007). 
Forming the gridded density held the FTs require is linear 
in the number of particles (e.g., using triangular cloud-in¬ 
cell interpolation (Hockney & Eastwood 1981; Jing 2005)), 


so we avoid any quadratic scaling with the galaxy density. 
Therefore, in principle the Landy-Szalay estimator can be 
straightforwardly evaluated using FTs. 

2.2 Comparing FTs and pair counting 

We now discuss the practical disadvantages and advantages 
of pair counting relative to a Fourier approach. Ultimately 
we will show that several perceived issues with the FT 
method that seemed to make pair counting more favorable 
can be resolved by the approach of this paper. 

Relative to the Fourier method, pair counting has the 
disadvantage that the work scales as the square of the num¬ 
ber of points. It is a simple optimization, using trees or grids, 
to avoid working with pairs that are much further sepa¬ 
rated than the maximum scale of interest. However for work 
on large scales, the large number of pairs can be computa¬ 
tionally burdensome. This is particularly true because the 
number of randoms typically should be much larger than 
(of order 100 times) the number of data points. The com¬ 
putational expense can be somewhat reduced by using tree 
methods that aggregate many points into cells that are then 
added to the count of a given separation bin as a unit. How¬ 
ever, this only helps if the binning is substantially coarser 
than the interparticle spacing, which often is not the case in 
practical galaxy surveys. 

Pair counting does have some important advantages. 
First, it avoids any gridding of the data. Gridding results in 
small displacements of the effective particle positions, which 
in turn produce correlation results that are smoothed ver¬ 
sions of the true answer (see Jing 2005 for discussion of the 
effect in the power spectrum and how it can be ameliorated). 
Cartesian gridding can also cause artifacts when summing 
over separation bins in spherical coordinates. Decreasing the 
grid spacing can decrease these biases, but at an increased 
computational cost. Second, pair counting allows easy com¬ 
putation of the anisotropy of the correlations relative to the 
line of sight, since in real space the orientation of each pair 
to the line of sight is clear. Fourier methods, on the other 
hand, use a Cartesian basis that treats positions in the sur¬ 
vey without preference as to orientiation to the line of sight, 
appearing to destroy this information. Third, pair counting 
produces an unbiased estimate of the correlation function 
regardless of the survey geometry. In contrast, an FT-based 
convolution of the 5 field yields a misnormalized result due 
to the zero-padding outside of the survey volume. 

In this work, we show that the last two of these problems 
can be easily avoided when using FTs. Here, we discuss the 
zero-padding issue, deferring the anisotropic correlations to 
Section |3l Equation ((2| shows that there is no need to form 
the 5 field. The zero padding of the grid beyond the survey 
boundary here is of no consequence because it will enter both 
the numerator and denominator of equation ([2]) and hence 
cancel out. Thus the value of ^ should be the same, up to 
grid smoothing, whether one has used pair counting or the 
FT. Consequently any advantage of the Landy-Szalay com¬ 
putation of the monopole of the correlation function, e.g., 
as regards the integral constraint (Coil 2012 for definition), 
will remain. 

Finally, we note that there is a common misunderstand¬ 
ing that for non-periodic volumes one must embed in a pe¬ 
riodic domain that is twice the size of the original survey. 
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This is not true if one is only interested in a limited range 
of separations s. One need only use a periodic embedding 
large enough that the separation between any point in the 
survey and any point in the periodic duplicate is larger than 


3 ANISOTROPIC CORRELATIONS 

We now turn to the anisotropic 2PCF, described in terms 
of its multipole moments The anisotropies of the corre¬ 
lation functions, and most importantly the quadrupole ^ 2 , 
have important cosmological purpose for the measurement 
of the Alcock-Paczynski effect (Alcock & Paczynski 1979), 
redshift-space distortions (Kaiser 1987; Hamilton 1998 for a 
review), and anisotropic baryon acoustic oscillation (BAO) 
signature (Gaztanaga, Cabre & Hui 2009). 

Here the multipole moments are with respect to the 
angle between the pair separation s and the line of sight 
n = (ri -|- r2)/2, and we define /r = s ■ n. We first write ^aniso 
as a function of bin S in separation magnitude s = |s| and 

fi-. 

Uiso(5,^) = ^1^, (4) 

where Af and TZ respectively denote NN and RR. We ex¬ 
pand ^aniso, A/”, and TZ as multipole series: 

00 00 

Caniso(S,/x) = ^C^(S)P,(m), Af{S,n) = ^Affc(S)Pfc(^), 
^=0 k=0 

00 

P(5,M) = ^P,(r)P,(^). (5) 

1=0 

Pe is a Legendre polynomial, and as usual, these relations 
imply 5^(5') = [(2^-|-l)/2] f d^Pr(/i)^aniso(S',/i) and similarly 
for Afc(S') and Rj{S). Multiplying equation (|3| through by 
TZ and then inserting equations ([51), we find 

J2^i{s)nAS)Pi{n)Pj{n) = ^A4(5)Pfc(/i). (6) 

ij k 

Using a linearization formula for the product of two Legen¬ 
dre polynomials we obtain, with arguments suppressed, 

J2^eTZj{2j + l)( Q J j)p, =^A4Pfc. (7) 

ijq ^ 2 f. 

The Wigner 3j-symbol above describes angular momentum 
coupling. Using orthogonality, separating out the j = 0 
term, dividing through by TZo, and defining fj = TZjjTZo, 
we obtain 

t = ( 8 ) 

^ i j>0 ^ ' 

Defining a multipole coupling matrix M with elements 

M,, = (2fc + l)^( J [ M'/, (9) 

i>o ^ ^ 

we see that equation can be written as 

A/■/7^0 = (I + M)^1niso = Aelniso, (10) 

where N = (A/o,A/i, • • ■ , A/£„„) and analogously for ^aniso. 


The edge-correct ion equation can then be solved by 
matrix inversion. Formally we need all multipoles of the 
randoms TZj to obtain the solution, but in practice the /,■ 
should fall off so quickly that measuring only out to some 
^max and using this to invert a truncated, finite-dimensional 
sub-matrix of M should suffice. For more detailed discussion 
of similar issues arising in the 3PCF, see SE15 Section 4.2. 
Note also that if we are computing an auto-correlation, then 
by parity all odd-order coefficients in equation ([S]) vanish. 

With equation (USD for the vector of multipole coeffi¬ 
cients ^aniso in hand, our task now becomes measuring the 
A4 and TZj it requires. We restrict to the case where the 
pairs project to only a small angle on the sky; for discussion 
of wide-angle effects, see Samushia, Percival & Raccanelli 
(2012) and Raccanelli et al. (2013). In this limit, we then 
approximate /r = s ■ n by s ■ ri, i.e., approximating that the 
fine of sight to the pair is very nearly the line of sight to one 
member (Yamamoto et al. 2006). We write 

A4(5') =(2fc -I- 1) J s^ds9{s;S) J r\dri 

X J dflsdflriPk{s ■ ri)N{ri)N{ri + s) (11) 

and analogously for TZj. The integral over dri averages over 
translations. Consolidating integrals, we find 

J\fk{s) = {2k + 1) j ds6{s;r) j dn Pk{s ■ ri)N{ri)N{ri + s). 

( 12 ) 

We now show how to cast the inner integral as a convolu¬ 
tion, which will then permit its evaluation via FTs. Using 
the spherical harmonic addition theorem (Arfken, Weber & 
Harris 2013, hereafter AWH13, equation 16.57) the integral 
becomes 

k p 

A4(s) = 47r ^ / ds6{s-,S) 

m= — k 

X j driYkm{s)Y^m{ri)N{ri)N{ri+s) 

k p 

= y rfse(s;S)n^(s)[(Ny;^)*N](s). (13) 

The approach here generalizes to any separable kernel in¬ 
serted in place of the Pi above. 

With the problem thus set up as a convolution, to com¬ 
pute a particular multipole A4 we will need 2^ -f 2 real for¬ 
ward FTs, one for the data minus randoms, N, and then 
2^ -f 1 for the independent components of the NYim- We 
then need 2^ -|- 2 real inverse transforms after taking the 
products in Fourier space. Note that once the convolution 
is computed, we can perform all of the integrals over the 
binning as needed. Further, we can separate the real and 
imaginary components of the spherical harmonics and com¬ 
pute all of these 2^ -|- 1 terms sequentially, which allows us 
to store only 3 copies of the full grid at a time {N and its 
FT, plus the working space for the convolution), while ac¬ 
cumulating the resulting contributions to A4. 

The computation of TZj proceeds identically. However, 
we note that because the TZ pair count does not involve a 
near-cancellation as A/" = {D — R)^ does, one can use a 
substantially smaller random catalog when computing TZj. 
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Inspecting equation (11) shows that TZo appears as a nor¬ 
malization of the correlation function, while the fj ratios 
only slightly mix terms. Moreover, when computing repeat¬ 
edly on large numbers of mock catalogs, one would generally 
not need to repeat the computation of TZ for each. 


4 3PCF WITH FOURIER TRANSFORMS 

We now show how to use FTs to accelerate the algorithm 
for measuring multipole moments of the 3PCF presented in 
SE15. Note that here we consider only the isotropic 3PCF 
and do not track orientiation to the line of sight. We first 
recall that this algorithm measures the binned radial coefh- 
cients in an expansion of the 3PCF as 

aSi,S2-,ii -^2) = Y^Ci{Si,S2)Pe{si -52). (14) 

e 

The algorithm exploits the spherical harmonic addition the¬ 
orem to break the Legendre polynomial into a separated sum 
of spherical harmonics and then obtains the spherical har¬ 
monic coefhcients a£m(5;x) in each radial bin (designated 
by S) and around each galaxy in the survey (with position 
x). These are combined locally about each galaxy to form 
the binned estimate about that origin, 

1 * 

Cr(5'i,S'2;x) = —(5(x) ^ a^^(S'i; x)aJ^(S' 2 ; x) (15) 

m = — i 

(SE15 equation 15), and finally averaged over all possible 
origins x to find 

Cr(S'i,S2) J dx6(5'i,S'2;x) (16) 

(SE15 equation 12). 

Here we show that computing the local a(rn{S; x) is sim¬ 
ply a convolution and so can be accelerated with FTs. We 
begin with SE15 equation 14 for the aem about a particular 
origin of an arbitrary density field <5: 

aem{S;x) ^ j dO! j r'^dr'6l(|r'|; S')(5(r'+ x). 

(17) 

Consolidating integrals, this becomes 

a£m(S';x) = y dr'Yft„(f')6l(|r'|;S)(l(r'-|-x), (18) 

which clearly has the form of a convolution. Hence by the 
Convolution Theorem 

ar™(S;x) = Fp-^ {Ar™(k; S)5(k)} (x), (19) 

where again (5(k) is the FT of the density field d(r') and 
S) is the FT of the kernel 

Ke^iv'-S) = Y,Ur')ei\r'\-S). ( 20 ) 

Thus where in SE15 we needed an 0{nVraax) operation 
about each possible origin x to compute each aim for a given 
radial bin, and hence 0{NnVniax) operations total for each 
radial bin, we now simply need Ng log Ng operations total to 
compute each aem in a given radial bln and for all origins. 

In detail, in SE15 the Q are expressed in terms of mul¬ 
tipole moments of the NNN and RRR fields (see SE15 Sec¬ 
tion 4), which can be obtained using equations (I17I) - (I20I) 


with S = NNN and then S — RRR successively. If we want 
the multipole coefhcients of, e.g., NNN, up to £max in Abins 
radial bins, we need one real forward FT of the density held, 
Abins(^max + 1)^ real forward transforms for the kernels Kem, 
and hnally this same number of real inverse transforms after 
taking products in Fourier space. The same holds for RRR. 

The kernel Kem is simple and so its forward FTs can 
be done analytically, essentially halving the total number of 
FTs required. We have 

A,^(k; S)^ j dv'(^'^-'^'e{\v'\-S)Yem{i') 

= {Aie)fY;m{k)Uk-S). ( 21 ) 

We expanded the plane wave using AWH13 equation 16.61, 
performed the angular integral by orthogonality, and dehned 

je{k;S)= j u^duje{ku)9{u;S). (22) 

Analytically evaluating the kernel’s FTs is not always favor¬ 
able. Doing the FTs all numerically treats data and kernel on 
the same footing: both will be gridded using, e.g., cloud-in- 
cell and then transformed. If one computes Kem analytically, 
however, one must then grid in Fourier space so as to match 
the gridded, transformed data. Transforming and then grid- 
ding only reduces to gridding and then transforming in the 
limit of a very hne grid. Otherwise we expect this reordering 
of operations might introduce additional artifacts. However 
it is precisely in applications where the grid is extremely fine 
that the FT will take longest and so imply the greatest need 
to reduce the number of transforms by analytic evaluation 
of Kem- 


5 DISCUSSION AND CONCLUSIONS 

We have presented Fourier Transform methods for the com¬ 
putation of the 2PCF, anisotropic 2PCF, and 3PCF. For 
the 2PCF, we have shown that the familiar Landy-Szalay 
estimator can be immediately translated into an FT compu¬ 
tation. We then show that the multipoles of the anisotropic 
2PCF can be computed by FTs, despite the curvature of the 
sky relative to the Cartesian grid. After computing the mul¬ 
tipoles of the NN numerator and RR denominator, one can 
easily transform to the multipoles of ^aniso. For the 3PCF, 
we have shown that the SE15 estimator for the Legendre 
decomposition of the 3PCF can be computed with FTs. 

In all cases, the resulting algorithm scales only linearly 
with the number of survey objects (or random samples) and 
only 0{Ng In Ng) with the grid size. For some important ap¬ 
plications, this is faster than the 0{NnVR^^,^) scaling of the 
pair-counting methods (and the SE15 3PCF method). The 
speed advantages are maximized when one considers larger 
separations, higher number densities, and coarser grids. 

However, Fourier methods do introduce artifacts due to 
grid resolution, the level of which will depend on the ra¬ 
tio of the grid spacing to the radial bin width being used. 
For example, in the BAO analysis of Anderson et al. (2014), 
2PCF separation bins of Mpc were used. One would 
then want the FT grid to be comfortably smaller than this. 
It is worth noting that the FT artifacts would be reduced 
If one used radial separation bins with smoother edges in¬ 
stead of the traditional tophats. Smoother bins are accept¬ 
able for science applications such as BAO, which is a smooth 
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feature itself, and indeed are numerically more stable for 
the spherical Bessel transforms needed to form model 2PCF 
from theoretical power spectra. One might proceed by tun¬ 
ing the bins to have negligible support beyond wavenum¬ 
ber k « 0 . 4/1 Mpc“^, where the acoustic oscillations have 
been damped away, while choosing an FT grid scale of 3- 
Ah~^ Mpc so as to place the Nyquist frequency comfortably 
above that smoothing scale. 

We expect that an important application of these meth¬ 
ods is to the calculation of correlation functions from mock 
catalogs. The computation of covariance matrices now is 
commonly performed by repeating the calculation on hun¬ 
dreds or thousands of mock catalogs. This dominates the 
computational effort of the cosmology analysis. But it is a 
place where making a mild sacrifice in the accuracy of the 
pair count may be acceptable to gain the speed FTs offer. An 
interesting aspect of these FT approximations is that they 
should converge to the pair-counting answer as the grid size 
increases. One might opt to compute mocks with a reason¬ 
ably efficient grid, accepting an error that is smalf compared 
to the survey variations, white stilt processing the actuaf 
data with a finer grid or an expiicit pair-counting code. 

While we were preparing this work for submission, 
Bianchi et al. (2015) and Scoccimarro (2015) posted pre¬ 
prints suggesting use of FTs to compute the anisotropies of 
the large-scale power spectrum about the line-of-sight, us¬ 
ing respectively the Yamamoto et al. (2006) estimator and 
a newly constructed estimator. Scoccimarro also uses FTs 
to estimate the redshift-space bispectrum. The core math¬ 
ematical approach of these works is the same as what we 
present in Section [3] for the anisotropic 2PCF. In detail, 
they present their results as polynomial expansions of Leg¬ 
endre polynomials, whereas we expand in spherical harmon¬ 
ics. The spherical harmonics can be directly translated to 
polynomials when computing (SE15 Section 2), and exten¬ 
sions to higher multipoles are likely more convenient to track 
with Ytm- We note that the very large computational advan¬ 
tage reported in Bianchi et al. (2015) is specific to the power 
spectrum, which otherwise required summing over aif pairs 
of survey objects. For the 2PCF, common methods oniy need 
to count pairs within Knax, so the FT advantage over pair 
counting is more modest for reafistic grid sizes. 

The coming generation of large galaxy surveys will 
stress our computational resources not simply because of 
the survey size but also because of the drive for increasing 
analysis accuracy, which manifests itself in larger numbers 
of mock catalogs and analysis variations. We believe that 
Fourier methods such as those presented here offer an impor¬ 
tant means of enhancing the computational speed of future 
cosmological analyses. 
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